First and foremost, I will make a quick analysis of the Brazilian market. This analysis will help me choose which direction to follow, since I have little time to try as many approaches as possible. At this first stage, we have only done data mining to prepare the data for the forecast analysis.
A remark on how I created the Average Pack Size groups: I used a boxplot to set the size of each group.
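As a minimal sketch of that grouping, assuming the group boundaries are the quartiles that a boxplot draws (the exact cut points used in this analysis are not stated), the assignment could look like the code below. The function name, the group labels and the sample pack sizes are hypothetical.

```python
from statistics import quantiles


def pack_size_groups(sizes):
    """Label each pack size using the boxplot quartiles as boundaries.

    Assumption: values up to Q1 are "small", values up to Q3 are "medium",
    and anything above Q3 is "large".
    """
    # Q1, median and Q3 are the three lines of the box in a boxplot.
    q1, median, q3 = quantiles(sizes, n=4, method="inclusive")

    def label(size):
        if size <= q1:
            return "small"
        if size <= q3:
            return "medium"
        return "large"

    return [label(s) for s in sizes], (q1, median, q3)


# Illustrative pack sizes only, not data from this analysis.
sample_sizes = [50, 80, 100, 100, 120, 150, 200, 400, 1000]
labels, bounds = pack_size_groups(sample_sizes)
print(bounds)
print(labels)
```

Using the quartiles keeps the groups robust to a few very large packs, which would distort equal-width bins.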
The first graph below, with volume and value normalized, helps us understand the movement of the market. In 2016, market volume decreased more sharply than market value, although the decline was less steep than before, as the slope of the blue line shows. One of my questions for the period after June 2016 is whether price may be impacting volume, since the red line is almost flat.
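The kind of normalization is not specified here. A minimal sketch, assuming min-max scaling to the [0, 1] range so that volume and value can share one axis, is shown below. The function name and the sample figures are made up for illustration.

```python
def min_max_normalize(series):
    """Rescale a list of numbers to the [0, 1] range (min-max scaling)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series has no spread, so map every point to 0.0.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


# Illustrative monthly figures only, on very different scales.
volume = [1200, 1100, 900, 850]
value = [5400000, 5300000, 5100000, 5050000]

print(min_max_normalize(volume))
print(min_max_normalize(value))
```

After scaling, both series run between 0 and 1, so their slopes can be compared directly on the same chart.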
To confirm the conclusion about price, note in the two charts below that the size of the points, which represents the price, did not change over time.
A quick look at which Flavor has the most impact on the market: it is easy to see in the graph below that the Milk Chocolate Flavor has a huge impact on the market result. So it could be interesting to run separate analyses, with one cluster for Milk Chocolate and one cluster for the other Flavors. Splitting the data into two clusters and zooming in on the second one, we can see that the cherry flavor, followed by the coffee flavor, is the most predominant.
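The split described above can be sketched as follows, assuming the data is available as (flavor, value) pairs. The function name and the sample rows are hypothetical and are not taken from the real dataset.

```python
from collections import defaultdict


def split_by_dominant_flavor(rows, dominant="Milk Chocolate"):
    """Total the value per flavor, then separate the dominant flavor.

    Returns two dicts: the dominant cluster and the remaining flavors,
    the latter sorted from largest to smallest total.
    """
    totals = defaultdict(float)
    for flavor, value in rows:
        totals[flavor] += value

    dominant_cluster = {f: v for f, v in totals.items() if f == dominant}
    others = [(f, v) for f, v in totals.items() if f != dominant]
    other_cluster = dict(sorted(others, key=lambda kv: kv[1], reverse=True))
    return dominant_cluster, other_cluster


# Illustrative rows only.
rows = [
    ("Milk Chocolate", 900.0),
    ("Coffee", 40.0),
    ("Cherry", 60.0),
    ("Milk Chocolate", 850.0),
    ("Cherry", 30.0),
]
main, rest = split_by_dominant_flavor(rows)
print(main)
print(rest)
```

Plotting the second dict on its own axis is what makes the smaller flavors readable once the dominant one is removed.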
Now you can check the same data without normalization. I had to use 3 different graphs because the range of values is large, so plotting everything in a single chart would obscure the information.
Analysing the market by material, it is easy to see that plastic is predominant. The column chart on the left shows market value and the one on the right shows market volume, both by package material. At the end of 2017 there seems to be an inverse correlation, although it may be just a coincidence, given that campaigns against plastics are increasing nowadays.
The caloric content has always been dominated by sugar, as the graph below shows.
Two package sizes lead the market over all the others.
After understanding the market, we can select the most influential categories to run the forecast model. This usually helps computing performance because the model works with less data.
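One possible way to pick those categories, shown here only as a sketch, is to keep the largest ones until they cover a chosen share of the market. The 80% threshold, the function name and the sample totals are assumptions for illustration, not values from this analysis.

```python
def top_categories(totals, coverage=0.8):
    """Keep the largest categories until they cover `coverage` of the total.

    `totals` maps a category name to its market total (value or volume).
    """
    grand_total = sum(totals.values())
    kept, running = [], 0.0
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    for name, amount in ranked:
        if running >= coverage * grand_total:
            break  # The categories kept so far already reach the target share.
        kept.append(name)
        running += amount
    return kept


# Illustrative totals by package material.
material_totals = {"Plastic": 70, "Glass": 20, "Paper": 7, "Metal": 3}
print(top_categories(material_totals))
```

Filtering the dataset to the returned categories before fitting the forecast model reduces the number of series it has to process.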